NSF PAR Search | NSF Public Access Repository


Conformal Prediction for Network-Assisted Regression

https://doi.org/10.1080/01621459.2025.2506198

Lunde, Robert; Levina, Elizaveta; Zhu, Ji (July 2025, Journal of the American Statistical Association)

Free, publicly-accessible full text available July 23, 2026
High-dimensional Factor Analysis for Network-linked data

https://doi.org/10.1093/biomet/asaf012

Li, Jinming; Xu, Gongjun; Zhu, Ji (February 2025, Biometrika)

Abstract: Factor analysis is a widely used statistical tool in many scientific disciplines, such as psychology, economics, and sociology. As observations linked by networks become increasingly common, incorporating network structures into factor analysis remains an open problem. In this paper, we focus on high-dimensional factor analysis involving network-connected observations, and propose a generalized factor model with latent factors that account for both the network structure and the dependence structure among high-dimensional variables. These latent factors can be shared by the high-dimensional variables and the network, or exclusively applied to either of them. We develop a computationally efficient estimation procedure and establish asymptotic inferential theories. Notably, we show that by borrowing information from the network, the proposed estimator of the factor loading matrix achieves optimal asymptotic variance under much milder identifiability constraints than existing literature. Furthermore, we develop a hypothesis testing procedure to tackle the challenge of discerning the shared and individual latent factors’ structure. The finite sample performance of the proposed method is demonstrated through simulation studies and a real-world dataset involving a statistician co-authorship network.
Free, publicly-accessible full text available February 21, 2026
Variational Estimators of the Degree-corrected Latent Block Model for Bipartite Networks

Zhao, Yunpeng; Hao, Ning; Zhu, Ji (May 2024, Journal of machine learning research)

Bipartite graphs are ubiquitous across various scientific and engineering fields. Simultaneously grouping the two types of nodes in a bipartite graph via biclustering represents a fundamental challenge in network analysis for such graphs. The latent block model (LBM) is a commonly used model-based tool for biclustering. However, the effectiveness of the LBM is often limited by the influence of row and column sums in the data matrix. To address this limitation, we introduce the degree-corrected latent block model (DC-LBM), which accounts for the varying degrees in row and column clusters, significantly enhancing performance on real-world data sets and simulated data. We develop an efficient variational expectation-maximization algorithm by creating closed-form solutions for parameter estimates in the M steps. Furthermore, we prove the label consistency and the rate of convergence of the variational estimator under the DC-LBM, allowing the expected graph density to approach zero as long as the average expected degrees of rows and columns approach infinity when the size of the graph increases.
Full Text Available
Variational Estimators of the Degree-corrected Latent Block Model for Bipartite Networks

Zhao, Yunpeng; Hao, Ning; Zhu, Ji (May 2024, Journal of machine learning research)
Shen, Xiaotong (Ed.)
Full Text Available
Statistics and AI: A Fireside Conversation

https://doi.org/10.1162/99608f92.c066fe9c

Lin, Xihong; Cai, Tianxi; Donoho, David; Fu, Haoda; Ke, Tracy; Jin, Jiashun; Meng, Xiao-Li; Qu, Annie; Shi, Chengchun; Song, Peter; et al (January 2025, Harvard data science review)

Full Text Available
Community models for networks observed through edge nominations

Li, Tianxi; Levina, Elizaveta; Zhu, Ji (October 2023, Journal of machine learning research)

Communities are a common and widely studied structure in networks, typically assuming that the network is fully and correctly observed. In practice, network data are often collected by querying nodes about their connections. In some settings, all edges of a sampled node will be recorded, and in others, a node may be asked to name its connections. These sampling mechanisms introduce noise and bias, which can obscure the community structure and invalidate assumptions underlying standard community detection methods. We propose a general model for a class of network sampling mechanisms based on recording edges via querying nodes, designed to improve community detection for network data collected in this fashion. We model edge sampling probabilities as a function of both individual preferences and community parameters, and show community detection can be performed by spectral clustering under this general class of models. We also propose, as a special case of the general framework, a parametric model for directed networks we call the nomination stochastic block model, which allows for meaningful parameter interpretations and can be fitted by the method of moments. In this case, spectral clustering and the method of moments are computationally efficient and come with theoretical guarantees of consistency. We evaluate the proposed model in simulation studies on unweighted and weighted networks and under misspecified models. The method is applied to a faculty hiring dataset, discovering a meaningful hierarchy of communities among US business schools.
Full Text Available
Link Prediction for Egocentrically Sampled Networks

https://doi.org/10.1080/10618600.2022.2163648

Li, Tianxi; Wu, Yun-Jhong; Levina, Elizaveta; Zhu, Ji (January 2023, Journal of computational and graphical statistics)

Full Text Available
Joint latent space models for network data with high-dimensional node variables

https://doi.org/10.1093/biomet/asab063

Zhang, Xuefei; Xu, Gongjun; Zhu, Ji (December 2021, Biometrika)

Summary: Network latent space models assume that each node is associated with an unobserved latent position in a Euclidean space, and such latent variables determine the probability of two nodes connecting with each other. In many applications, nodes in the network are often observed along with high-dimensional node variables, and these node variables provide important information for understanding the network structure. However, classical network latent space models have several limitations in incorporating node variables. In this paper, we propose a joint latent space model where we assume that the latent variables not only explain the network structure, but are also informative for the multivariate node variables. We develop a projected gradient descent algorithm that estimates the latent positions using a criterion incorporating both network structure and node variables. We establish theoretical properties of the estimators and provide insights into how incorporating high-dimensional node variables could improve the estimation accuracy of the latent positions. We demonstrate the improvement in latent variable estimation and the improvements in associated downstream tasks, such as missing value imputation for node variables, by simulation studies and an application to a Facebook data example.
Full Text Available
Survival Analysis via Ordinary Differential Equations

https://doi.org/10.1080/01621459.2022.2051519

Tang, Weijing; He, Kevin; Xu, Gongjun; Zhu, Ji (January 2022, Journal of the American Statistical Association)

Full Text Available
MuSP: A multistep screening procedure for sparse recovery

https://doi.org/10.1002/sta4.352

Yang, Yuehan; Zhu, Ji; George, Edward I. (December 2021, Stat)

Full Text Available